A real implementation must address several concerns about D11 usage. D11 requests are executed synchronously, so the client waits while each request runs. On a multiprocessor workstation, the current client/server implementation of X11 could potentially achieve a performance benefit because an X client and the X server can run in parallel on different processors; synchronous D11 requests give up that parallelism. D11's lower operating system overhead could potentially make up for the loss.
A given implementation of protected procedure calls may not be fast enough. Xlib's buffering of protocol requests allows the operating system overhead to be amortized across several Xlib calls. If protected procedure calls are too expensive to be made once per D11 request, a similar buffering scheme may be necessary for D11 requests. Such a scheme would reintroduce the cost of protocol packing and unpacking but retain the other performance benefits of D11.
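
The buffering scheme described above might look roughly like the following sketch. It is illustrative only: the names `D11Buffer`, `d11_queue`, `d11_flush`, and `d11_protected_call`, as well as the 256-byte buffer size, are hypothetical and do not come from any D11 implementation. The protected procedure call is simulated by an ordinary function that counts how often it is entered, so the amortization can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define D11_BUF_BYTES 256  /* hypothetical buffer capacity */

/* Counters for the simulation; a real system would have no such globals. */
static unsigned long d11_ppc_count = 0;          /* protected calls made  */
static unsigned long d11_requests_executed = 0;  /* requests carried out  */

typedef struct {
    unsigned char data[D11_BUF_BYTES];  /* packed requests                */
    size_t used;                        /* bytes currently packed         */
    unsigned pending;                   /* number of packed requests      */
} D11Buffer;

/* Stand-in (assumption) for one protected procedure call that unpacks
 * and executes every request in the buffer. */
static void d11_protected_call(const unsigned char *data, size_t len,
                               unsigned nreq)
{
    (void)data;
    (void)len;
    d11_ppc_count++;
    d11_requests_executed += nreq;
}

/* Hand all pending requests over in a single protected call. */
static void d11_flush(D11Buffer *b)
{
    if (b->pending == 0)
        return;
    d11_protected_call(b->data, b->used, b->pending);
    b->used = 0;
    b->pending = 0;
}

/* Pack one request; flush first if it would not fit. */
static void d11_queue(D11Buffer *b, const void *req, size_t len)
{
    assert(len <= D11_BUF_BYTES);
    if (b->used + len > D11_BUF_BYTES)
        d11_flush(b);
    memcpy(b->data + b->used, req, len);
    b->used += len;
    b->pending++;
}
```

With a zero-initialized `D11Buffer`, queuing forty 16-byte requests and then calling `d11_flush` enters the simulated protected call three times rather than forty. The `memcpy` in `d11_queue` is the packing cost that buffering reintroduces.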
An APP per window system connection may be rather expensive in terms of the APP's associated kernel data structures. The cost of APPs should be minimized.